Let’s begin with a general time-series plot showing bike ridership over the range of dates for which we have data.


I’ve used the dygraphs package for the time-series plot as it creates no-nonsense interactive time-series plots with very little overhead for the analyst. Once you are familiar with creating the kind of time-series object that the dygraph() function expects (xts, or something that can be coerced to xts), there is little else you need for a quality plot.
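As a rough illustration of that workflow, here is a minimal sketch that builds an xts object and hands it to dygraph(). The data is simulated and the names (`dates`, `riders`, `rides_xts`) are placeholders of my own, not the real counter data.

```r
library(xts)
library(dygraphs)

# Simulated daily counts (an assumption for illustration only).
set.seed(42)
dates <- seq(as.Date("2018-01-01"), by = "day", length.out = 365)
day_of_year <- as.numeric(format(dates, "%j"))
# Rough seasonal shape: more riders mid-year, fewer in winter.
riders <- round(2000 + 1500 * sin(2 * pi * (day_of_year - 80) / 365) +
                  rnorm(length(dates), sd = 200))

# xts pairs the data with a time index supplied through order.by.
rides_xts <- xts(data.frame(riders = riders), order.by = dates)

# dygraph() takes the xts object directly; dyRangeSelector() adds a zoom bar.
p <- dygraph(rides_xts, main = "Daily bike count (simulated)")
p <- dyRangeSelector(p)
p
```

Most of the effort goes into getting the data into xts form; the plotting itself is a couple of lines.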

Seasonal effects are evident in many time-series trends and we would expect such an effect to be particularly strong with this type of activity - and the plot shows this to be true. The peak summer months see three to four times as many riders as their winter counterparts.

Closer look at seasonal differences


Let’s get some actual numbers for this seasonal difference. If we take October - March as ‘cold rainy’ months in Seattle and then April - September as ‘warm sunny’ months we can get some exact figures on the discrepancy of ridership between these two periods.

This is just an alternative way to see the gulf between rider numbers in the winter and summer months. The data wrangling behind this plot, though, gives us the specific numbers to work with.
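The kind of wrangling involved can be sketched in base R. The data frame and its column names (`date`, `riders`) are made up, and grouping by calendar year is a simplifying assumption of mine, since an October - March winter actually straddles two years.

```r
# Hypothetical monthly totals for two years (not the real counter data).
counts <- data.frame(
  date   = seq(as.Date("2017-01-01"), by = "month", length.out = 24),
  riders = rep(c(1000, 1000, 1000, 3000, 3000, 3000,
                 3000, 3000, 3000, 1000, 1000, 1000), times = 2)
)

# April - September counts as 'warm sunny'; everything else is 'cold rainy'.
month_num <- as.integer(format(counts$date, "%m"))
counts$season <- ifelse(month_num %in% 4:9, "warm sunny", "cold rainy")
counts$year <- format(counts$date, "%Y")

# Total riders per year and season.
season_totals <- aggregate(riders ~ year + season, data = counts, FUN = sum)
season_totals
```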

Also, it may be a reasonable hypothesis that those who ride in the winter months are doing so out of necessity more often than not, and therefore represent a baseline of people committed to using their bicycle to get to and from their workplace. The summer months then perhaps bring some people who are willing to cycle only in those conditions, plus the large number of people who cycle as a hobby when the weather is good. This plot perhaps shows more clearly that 2018 saw the decline in winter rider numbers stop, with a marked increase, whereas the same year saw a significant drop in the summer months. So there is potentially some positivity to be derived in terms of riders who commit to the cycling commute, if not in terms of overall numbers.

Winter ridership as a proportion of summer ridership


For the five years for which we have ‘complete’ data, these gauges show ridership in the cold winter months as a proportion of ridership in the nice summer months.
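The number behind each gauge is just a ratio. A minimal sketch, using invented seasonal totals rather than the real figures:

```r
# Hypothetical seasonal totals per year (illustrative values only).
totals <- data.frame(
  year   = c(2014, 2015, 2016),
  winter = c(300000, 280000, 260000),
  summer = c(900000, 800000, 650000)
)

# Winter ridership as a percentage of summer ridership, to one decimal place.
totals$winter_pct <- round(100 * totals$winter / totals$summer, 1)
totals
```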

I just really like how clean gauges look in flexdashboard, so I threw these in despite it being a pretty silly way to show this changing proportion over time.

Storyboard layout in flexdashboard is built to display one component per page, so you need to be a little bit creative to get more than one component onto a page. In this case I just needed to put each gauge() call into its own <div> element, which can be done in a variety of ways, e.g. HTML() or tags$div from the shiny package.
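One way that could look is sketched below. It uses `tags$div` from htmltools (which shiny re-exports); the percentages are invented and the inline style is only my guess at a workable layout, not the styling used in the dashboard.

```r
library(flexdashboard)
library(htmltools)

# Hypothetical winter-as-percent-of-summer values, one per year.
winter_pct <- c("2014" = 33, "2015" = 35, "2016" = 40)

# One gauge per year, each wrapped in its own <div> so that several
# can sit side by side inside a single storyboard frame.
gauge_divs <- lapply(names(winter_pct), function(yr) {
  tags$div(
    style = "display: inline-block; width: 30%;",  # assumed styling
    gauge(winter_pct[[yr]], min = 0, max = 100, symbol = "%", label = yr)
  )
})

# Returning the tag list from an R Markdown chunk renders all the gauges.
tagList(gauge_divs)
```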

Bike count by day/season


Hovering over the points will give the exact values. We do this with the ggiraph package, which allows interactive elements to be added to ggplot objects. That means I can style the plot exactly how I like with my usual ggplot tricks and then add some interactivity on top. It’s not all that dissimilar from wrapping ggplotly around your ggplot object, but I’m liking the ggiraph method a little more for basic interactivity at the moment.
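A minimal sketch of the ggiraph pattern, assuming a small made-up data frame of counts by weekday and season (the column names and values are placeholders):

```r
library(ggplot2)
library(ggiraph)

# Hypothetical average counts by weekday and season.
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
day_counts <- data.frame(
  day    = factor(rep(days, times = 2), levels = days),
  season = rep(c("cold rainy", "warm sunny"), each = 7),
  riders = c(1500, 1600, 1600, 1550, 1400, 500, 450,
             4200, 4400, 4500, 4300, 3900, 2000, 1900)
)

# An ordinary ggplot, except the point layer is the interactive variant
# and carries a tooltip aesthetic that is shown on hover.
p <- ggplot(day_counts, aes(x = day, y = riders, colour = season)) +
  geom_point_interactive(aes(tooltip = paste0(day, ": ", riders)), size = 3) +
  theme_minimal()

# girafe() turns the ggplot into an interactive htmlwidget.
widget <- girafe(ggobj = p)
widget
```

All the usual ggplot styling stays in place; only the geom and the final girafe() call change.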

By day/year/season


It’s somewhat interesting that the 2018 data shows such a huge drop-off on all days in the summer months with the exception of Wednesday. This is likely where some domain-specific (or local!) knowledge would be handy for ideas about why this may be the case. My first thought was that it could be a data logging issue, given these counters may just not work at times, but that doesn’t explain why the malfunctions (whatever they may be) would affect every day in summer 2018 except Wednesdays….

Basic Forecasting - Using Prophet


After wrangling the data to a total rider count per day across all crossings, I used the prophet package to produce a simple forecast of rider numbers for the next two years. This takes into account the daily trends and seasonal trends which we have seen to have a significant impact on these data.

I was going to flex some ggplot muscle to create a plot to display the results but then I realised that the package includes an in-built function to create a dygraph of the data so it would be crazy not to use that.

The plot shows the actual data points up to Feb 2019 when the data ends along with the smoothed trend line fit by prophet. The trend function then continues two years into the future providing daily predictions for rider numbers. As with the previous dygraphs in this storyboard you can hover over a particular date and see specific predictions.
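For reference, the basic prophet workflow looks something like the sketch below. The daily totals are simulated, and the model uses prophet’s defaults, which may differ from the settings behind the plot shown here.

```r
library(prophet)

# Simulated daily totals (an assumption for illustration).
# prophet expects a data frame with columns named ds (date) and y (value).
set.seed(1)
ds <- seq(as.Date("2016-01-01"), as.Date("2018-12-31"), by = "day")
day_of_year <- as.numeric(format(ds, "%j"))
y <- round(2000 + 1500 * sin(2 * pi * (day_of_year - 80) / 365) +
             rnorm(length(ds), sd = 200))
daily <- data.frame(ds = ds, y = y)

# Fit the model with default settings.
m <- prophet(daily)

# Extend the date range two years (730 days) past the end of the data.
future <- make_future_dataframe(m, periods = 730)

# Predictions (yhat, with yhat_lower / yhat_upper) for every date.
forecast <- predict(m, future)

# Built-in interactive dygraph of the fit and the forecast.
dyplot.prophet(m, forecast)
```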

Conclusion - Rating Storyboards from flexdashboard out of 10.

10

Despite having to find workarounds to get more than one component per page, it’s still a 10/10. At the end of the day the purpose is clarity, and one component per page makes 100% sense; it was only because this was a ‘toy’ example that I wanted to experiment with the gauges. The fact you can also implement storyboard elements inside one particular page is something I am keen to explore in the future.